Temporal Action Segmentation with High-level Complex Activity Labels
Authors
Abstract
The temporal action segmentation task segments videos temporally and predicts action labels for all frames. Fully supervising such a segmentation model requires dense frame-wise action annotations, which are expensive and tedious to collect. This work is the first to propose a Constituent Action Discovery (CAD) framework that requires only the video-wise high-level complex activity label as supervision for temporal action segmentation. The proposed approach automatically discovers constituent video actions using an activity classification task. Specifically, we define a finite number of latent action prototypes to construct video-level dual representations, with these prototypes learned collectively through the activity classification training. This setting endows our approach with the capability to discover actions potentially shared across multiple complex activities. Due to the lack of action-level supervision, we adopt the Hungarian matching algorithm to relate the latent action prototypes to ground truth semantic classes for evaluation. We show that, with the high-level supervision, the Hungarian matching can be extended from the existing levels to the global level. The global-level matching allows for action sharing across activities, which has never been considered in the literature before. Extensive experiments demonstrate that the discovered actions can help perform temporal action segmentation and activity recognition tasks.
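The evaluation step described in the abstract, relating unlabeled latent prototypes to ground-truth classes, can be illustrated with a small sketch. This is not the authors' implementation: the function names and the toy frame labels are hypothetical, the number of prototypes is assumed equal to the number of classes, and an exhaustive search over permutations stands in for the Hungarian algorithm (it finds the same optimal one-to-one assignment, but is only practical for a handful of classes).

```python
from itertools import permutations

def overlap_matrix(pred, gt, num_protos, num_classes):
    """counts[p][c] = number of frames assigned to prototype p whose true class is c."""
    counts = [[0] * num_classes for _ in range(num_protos)]
    for p, c in zip(pred, gt):
        counts[p][c] += 1
    return counts

def best_matching(counts):
    """Find the one-to-one prototype -> class mapping with maximal frame overlap.

    Brute-force stand-in for Hungarian matching; assumes a square matrix.
    """
    n = len(counts)
    best_perm, best_score = None, -1
    for perm in permutations(range(n)):
        score = sum(counts[p][perm[p]] for p in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score

def frame_accuracy(pred, gt, mapping):
    """Fraction of frames whose mapped prototype equals the ground-truth class."""
    hits = sum(1 for p, c in zip(pred, gt) if mapping[p] == c)
    return hits / len(gt)

# Hypothetical toy data: per-frame prototype indices and ground-truth class indices.
pred = [2, 2, 2, 0, 0, 1, 1, 1]
gt = [0, 0, 0, 1, 1, 2, 2, 1]

counts = overlap_matrix(pred, gt, num_protos=3, num_classes=3)
mapping, score = best_matching(counts)
acc = frame_accuracy(pred, gt, mapping)
print(mapping, score, acc)  # [1, 2, 0] 7 0.875
```

One plausible reading of the matching levels is that they differ in which frames are pooled into the overlap counts: a single video, all videos of one activity, or the whole dataset. That reading is an assumption based on the abstract's wording, not a detail it states.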
Similar resources
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges of each approach lead researchers to combine them and to use semi-automatic retrieval with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...
SOL: Segmentation with Overlapping Labels
Image segmentation is a fundamental problem in computer vision that involves partitioning an image into two or more segments. These segments usually correspond to objects of interest in the image, e.g., liver, kidneys, etc. The classic approach to this problem segments the image into mutually exclusive segments. However, this approach is not well suited to segmenting overlapping objects, e.g. c...
Interactive Segmentation with Super-Labels
In interactive segmentation, the most common way to model object appearance is by GMM or histogram, while MRFs are used to encourage spatial coherence among the object labels. This makes the strong assumption that pixels within each object are i.i.d. when in fact most objects have multiple distinct appearances and exhibit strong spatial correlation among their pixels. At the very least, this ca...
High Level Fuzzy Labels for Vague Concepts
Vague or imprecise concepts are fundamental to natural language. Human beings constantly use imprecise language to communicate with each other. We usually say ‘John is tall and strong’ but not ‘John is exactly 1.85 meters in height and he can lift 100 kg weights’. Humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements. This capability...
Journal
Journal title: IEEE Transactions on Multimedia
Year: 2023
ISSN: 1520-9210, 1941-0077
DOI: https://doi.org/10.1109/tmm.2022.3231099